The DC-Tree: A Fully Dynamic Index Structure for Data Warehouses
Abstract
Many companies have recognized the strategic importance of the knowledge hidden in their large databases and have built data warehouses. Typically, updates are collected and applied to the data warehouse periodically in batch mode, e.g., overnight. All derived information, such as index structures, then has to be updated as well. This standard approach of bulk incremental updates has two drawbacks. First, the average runtime for a single update is small, but the total runtime for the whole batch of updates may become rather large. Second, the contents of the data warehouse are not always up to date. In this paper, we introduce the DC-tree, a fully dynamic index structure for data warehouses modeled as a data cube. This new index structure is designed for applications where the above drawbacks of the bulk update approach are critical. The DC-tree is a hierarchical index structure, similar to the X-tree, that exploits the concept hierarchies typically defined for the dimensions of a data cube. Instead of minimum bounding rectangles and an artificial total ordering, the DC-tree uses minimum describing sets and the partial ordering of the attribute values induced by the concept hierarchies. Furthermore, for each minimum describing set in the directory, the values of the measure attributes are materialized. We conducted an extensive experimental performance evaluation using the TPC-D benchmark data. Our results demonstrate that the DC-tree yields a significant speed-up compared to the X-tree and to sequential search when processing general range queries on a data cube.
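The interplay of minimum describing sets, concept hierarchies, and materialized measures can be illustrated with a small sketch. This is not the authors' implementation: the hierarchies, the `Node` class, the field names, and the choice of a sum as the only aggregate are all assumptions made for illustration, and insertion and node splitting are left out. The sketch only shows how a range query might reuse a materialized total when a directory entry lies entirely inside the query range.

```python
from dataclasses import dataclass, field

# Hypothetical concept hierarchies: child value -> parent value ("ALL" is the root).
PARENT = {
    "region": {"Berlin": "Germany", "Munich": "Germany", "Paris": "France",
               "Germany": "ALL", "France": "ALL"},
    "product": {"cola": "drinks", "juice": "drinks", "bread": "food",
                "drinks": "ALL", "food": "ALL"},
}

def ancestors_or_self(dim, value):
    """Return the value together with all of its ancestors in dimension dim."""
    chain = [value]
    while chain[-1] in PARENT[dim]:
        chain.append(PARENT[dim][chain[-1]])
    return set(chain)

def covered(dim, value, query_values):
    """True if value, or one of its ancestors, is among the query values."""
    return bool(ancestors_or_self(dim, value) & set(query_values))

@dataclass
class Node:
    mds: dict                                      # dim -> set of describing values
    total: float = 0.0                             # materialized sum of the measure
    children: list = field(default_factory=list)   # child nodes (directory level)
    records: list = field(default_factory=list)    # (dims, measure) pairs (leaf level)

def contained(node, query):
    """Every describing value of the node lies inside the query range."""
    return all(covered(d, v, query[d]) for d in query for v in node.mds[d])

def overlaps(node, query):
    """In every dimension, some node value and some query value are related
    through the partial order (one is an ancestor-or-self of the other)."""
    return all(
        any(v in ancestors_or_self(d, q) or q in ancestors_or_self(d, v)
            for v in node.mds[d] for q in query[d])
        for d in query)

def range_sum(node, query):
    if not overlaps(node, query):
        return 0.0                       # prune: nothing below can match
    if contained(node, query):
        return node.total                # reuse the materialized aggregate
    if node.children:
        return sum(range_sum(c, query) for c in node.children)
    return sum(m for dims, m in node.records
               if all(covered(d, dims[d], query[d]) for d in query))

# Tiny hand-built tree with two leaves (all data is made up).
leaf_de = Node(mds={"region": {"Germany"}, "product": {"drinks"}}, total=30,
               records=[({"region": "Berlin", "product": "cola"}, 10),
                        ({"region": "Munich", "product": "juice"}, 20)])
leaf_fr = Node(mds={"region": {"France"}, "product": {"drinks", "food"}}, total=12,
               records=[({"region": "Paris", "product": "cola"}, 5),
                        ({"region": "Paris", "product": "bread"}, 7)])
root = Node(mds={"region": {"Germany", "France"}, "product": {"drinks", "food"}},
            total=42, children=[leaf_de, leaf_fr])

drinks_everywhere = range_sum(root, {"region": {"ALL"}, "product": {"drinks"}})
berlin_all = range_sum(root, {"region": {"Berlin"}, "product": {"ALL"}})
everything = range_sum(root, {"region": {"ALL"}, "product": {"ALL"}})
```

In this toy tree, the query for all drinks answers the German leaf from its materialized total and scans only the French leaf, whose describing set also contains food. The query for Berlin skips the French leaf entirely because France and Berlin are unrelated in the region hierarchy.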
Similar resources
Dynamic cubing for hierarchical multidimensional data space
Data warehouses have been used in many applications for quite a long time. Traditionally, new data is loaded into these warehouses through offline bulk updates, which implies that the latest data is not always available for analysis. This, however, is not acceptable in many modern applications (such as intelligent buildings, smart grids, etc.) that require the latest data for decision making. These mod...
A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries
Data streams are infinite, fast, time-stamped data elements which are received explosively. Generally, these elements need to be processed in an online, real-time way. So, algorithms to process data streams and answer queries on these streams are mostly one-pass. The execution of such algorithms has some challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...
Malmquist Productivity Index with Dynamic Network Structure
Data envelopment analysis (DEA) measures the relative efficiency of decision making units (DMUs) with multiple inputs and multiple outputs. The DEA-based Malmquist productivity index measures the productivity change over time. We propose a dynamic DEA model involving network structure in each period within the framework of DEA. We have previously published the network DEA (NDEA) and the dynamic DEA ...
The Dimension-Join: A New Index for Data Warehouses
There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...
A local measurement-based protection scheme for DER integrated DC microgrid using Bagging Tree
In recent years, DC microgrid has attracted considerable attention of the research community because of the wide usage of DC power-based appliances. However, the acceptance of DC microgrid by power utilities is still limited due to the issues associated with the development of a reliable protection scheme. The high magnitude of DC fault current, its rapid rate of rising and absence of zero cros...
Publication date: 2000